Tag

#coding agents

9 articles

Researchers escaped four top AI coding agents’ sandboxes without ever breaking them

Researchers have demonstrated that AI coding agents such as Cursor, Codex, and Gemini CLI can escape their sandboxes without the sandboxes themselves being broken, exposing a critical security flaw.

Jul 215

Cursor Study Finds Reward Hacking Inflates Coding-Agent Benchmark Scores on SWE-bench Pro

Learn to build a code agent evaluation system that detects reward hacking in benchmarking, where agents retrieve known fixes instead of deriving solutions.

Jun 26 · 28

research

Anthropic study finds men use AI coding agents more than twice as often as women in social science research

A new Anthropic study reveals that men use AI coding agents more than twice as often as women in social science research, highlighting a significant gender gap in AI tool adoption.

May 31 · 63

tools

Warp’s big bet on building open source with GPT-5.5

Warp integrates OpenAI's GPT-5.5 model to enhance coding agents across local, cloud, and open-source development workflows. The move positions Warp as a key player in AI-powered development tools.

May 27 · 65

Best AI Agents for Software Development Ranked: A Benchmark-Driven Look at the Current Field

Learn how to set up a benchmarking framework to evaluate AI coding agents like Claude Code and GPT-5.5, similar to industry benchmarks used in 2026.

May 14 · 55

Running Codex safely at OpenAI

OpenAI details its comprehensive security approach for running Codex, including sandboxing, network policies, and agent-native telemetry to support safe and compliant AI coding agent adoption.

May 8 · 52

Alibaba Qwen Team Releases Qwen3.6-27B: A Dense Open-Weight Model Outperforming 397B MoE on Agentic Coding Benchmarks

Alibaba's Qwen team has released Qwen3.6-27B, a dense open-weight model that outperforms a 397B MoE model on agentic coding benchmarks. It introduces a Thinking Preservation mechanism and a hybrid attention architecture.

Apr 22 · 79

Andrew Ng’s Team Releases Context Hub: An Open Source Tool that Gives Your Coding Agent the Up-to-Date API Documentation It Needs

Learn how to install and use Context Hub, an open-source tool from Andrew Ng's team at DeepLearning.AI that keeps coding agents supplied with up-to-date API documentation.

Mar 9 · 93

research

New ETH Zurich Study Proves Your AI Coding Agents are Failing Because Your AGENTS.md Files are too Detailed

Learn how too much detail in AI coding instructions can actually hurt performance, according to a new ETH Zurich study. Understand the concept of context engineering and why less can sometimes be more when guiding AI systems.

Feb 25 · 124